Search for: All records

Creators/Authors contains: "Baker, Ryan S"


Investigating Algorithmic Bias in Affect Detectors with Constructed Categories of Student Identity

Nasiar, Nidhi; Belitz, Clara; Lee, HaeJin; Stinar, Frank; Baker, Ryan S; Ocumpaugh, Jaclyn; Fancsali, Stephen E; Ritter, Steve; Bosch, Nigel (December 2025, Asia-Pacific Society for Computers in Education)

Algorithmic bias research often evaluates models in terms of traditional demographic categories (e.g., U.S. Census), but these categories may not capture nuanced, context-dependent identities relevant to learning. This study evaluates four affect detectors (boredom, confusion, engaged concentration, and frustration) developed for an adaptive math learning system. Metrics for algorithmic fairness (AUC, weighted F1, MADD) show subgroup differences across several categories that emerged from a free-response social identity survey (Twenty Statements Test; TST), including both those that mirror demographic categories (i.e., race and gender) and novel categories (i.e., Learner Identity, Interpersonal Style, and Sense of Competence). For demographic categories, the confusion detector performs better for boys than for girls and underperforms for West African students. Among novel categories, biases are found related to learner identity (boredom, engaged concentration, and confusion) and interpersonal style (confusion), but not for sense of competence. Results highlight the importance of using contextually grounded social identities to evaluate bias.
Full Text Available
Exploring Knowledge Tracing in Tutor-Student Dialogues using LLMs

https://doi.org/10.1145/3706468.3706501

Scarlatos, Alexander; Baker, Ryan S; Lan, Andrew (March 2025, ACM)

Full Text Available
Do MOOC Conversations Matter? Investigating the Role of Social Presence and Course-Relevant Discussion in Career Advancement

https://doi.org/10.1145/3698205.3733930

Mehta, Shruti; Srivastava, Namrata; Liu, Xiner; Vanacore, Kirk; Baker, Ryan S (July 2025, ACM)

While MOOCs have been widely studied in terms of student engagement and academic performance, the extent to which engagement within MOOCs predicts career advancement remains underexplored. Building on prior work, this study investigates how participation in discussion forums, specifically social presence and the use of course-relevant keywords, affects career advancement. Using GPT-assisted content analysis of forum posts, we assess how these engagement factors relate to both achievement during the course and post-course career advancement. Our findings indicate that social presence and use of course-relevant keywords have a positive relationship with course achievement during the MOOC. However, no significant relationship was found between career advancement and either social presence or course-related keywords in discussion forums. These findings suggest that while active engagement in MOOC discussion forums enhances academic achievement, it might not directly translate into career advancement, highlighting a possible disconnect between learning participation in MOOCs and professional outcomes.
Full Text Available
Qualitative Coding with GPT-4: Where it Works Better

https://doi.org/10.18608/jla.2025.8575

Liu, Xiner; Zambrano, Andres Felipe; Baker, Ryan S; Barany, Amanda; Ocumpaugh, Jaclyn; Zhang, Jiayi; Pankiewicz, Maciej; Nasiar, Nidhi; Wei, Zhanlan (March 2025, Journal of Learning Analytics)

This study explores the potential of the large language model GPT-4 as an automated tool for qualitative data analysis by educational researchers and examines which techniques are most successful for different types of constructs. Specifically, we assess three different prompt engineering strategies (Zero-shot, Few-shot, and Few-shot with contextual information) as well as the use of embeddings. We do so in the context of qualitatively coding three distinct educational datasets: Algebra I semi-personalized tutoring session transcripts, student observations in a game-based learning environment, and debugging behaviours in an introductory programming course. We evaluated the performance of each approach based on its inter-rater agreement with human coders and explored how different methods vary in effectiveness depending on a construct’s degree of clarity, concreteness, objectivity, granularity, and specificity. Our findings suggest that while GPT-4 can code a broad range of constructs, no single method consistently outperforms the others, and the selection of a particular method should be tailored to the specific properties of the construct and context being analyzed. We also found that GPT-4 has the most difficulty with the same constructs that human coders find more difficult to reach inter-rater reliability on.
Full Text Available
The Half-Life of Epistemic Emotions: How Motivation Influences Affective Chronometry

https://doi.org/10.5281/zenodo.15870172

Zambrano, Andres Felipe; Ocumpaugh, Jaclyn; Baker, Ryan S; Vanacore, Kirk; Esiason, Jordan; Vandenberg, Jessica (January 2025, International Educational Data Mining Society)
Mills, Caitlin; Alexandron, Giora; Taibi, Davide; Lo Bosco, Giosuè; Paquette, Luc (Ed.)
Research on epistemic emotions has often focused on how students transition between affective states (e.g., affect dynamics). More recently, studies have examined the properties of cases where a student remains in the same affective state over time, finding that the duration of a student's affective state is important for multiple learning outcomes. However, the likelihood of remaining in a given affective state has not been widely studied across different methods or systems. Additionally, the role of motivational factors in the persistence or decay of affective states remains underexplored. This study builds on two prior investigations into the exponential decay of epistemic emotions, expanding the analysis of affective chronometry by incorporating two detection methods based on student self-reports and trained observer labels in a game-based learning environment. We also examine the relationship between motivational measures and affective decay. Our findings indicate that boredom exhibits the slowest decay across both detection methods, while confusion is the least persistent. Furthermore, we found that higher situational interest and self-efficacy are associated with greater persistence in engaged concentration, as identified by both detection methods. This work provides novel insights into how motivational factors shape affective chronometry, contributing to a deeper understanding of the temporal dynamics of epistemic emotions.
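The notion of a half-life under exponential decay, which the title and abstract above refer to, can be illustrated with a minimal sketch. It assumes the simplest textbook form, in which the probability of still being in a state after time t is exp(-rate * t); the abstract does not give the authors' exact model, and the function names and decay rates below are hypothetical, invented only for illustration.

```python
import math


def survival_probability(t, decay_rate):
    """Probability that a state persists past time t under exponential decay."""
    return math.exp(-decay_rate * t)


def half_life(decay_rate):
    """Time at which the survival probability falls to one half."""
    if decay_rate <= 0:
        raise ValueError("decay_rate must be positive")
    return math.log(2) / decay_rate


# Hypothetical decay rates (per second); these are NOT values from the study.
example_rates = {"slow_decay_state": 0.01, "fast_decay_state": 0.05}

for name, rate in example_rates.items():
    # A smaller decay rate gives a longer half-life, i.e. a more persistent state.
    print(name, round(half_life(rate), 1))
```

Under this assumed model, a state described as having the "slowest decay" would correspond to the smallest rate and therefore the longest half-life.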
Full Text Available
Exploring Student Identity in Adaptive Learning Systems Through Qualitative Data

https://doi.org/10.1007/978-3-031-98462-4_45

Belitz, Clara; Lee, HaeJin; Nasiar, Nidhi; Fancsali, Stephen E; Stinar, Frank; Almoubayyed, Husni; Ritter, Steve; Baker, Ryan S; Ocumpaugh, Jaclyn; Bosch, Nigel (July 2025, Springer Nature Switzerland)

Adaptive learning systems are increasingly common in U.S. classrooms, but it is not yet clear whether their positive impacts are realized equally across all students. This study explores whether nuanced identity categories from open-ended self-reported data are associated with outcomes in an adaptive learning system for secondary mathematics. As a measure of impact of these social identity data, we correlate student responses for three categories (race and ethnicity, gender, and learning identity, a category combining student status and orientation toward learning) with total lessons completed in an adaptive learning system over one academic year. Results show the value of emergent and novel identity categories when measuring student outcomes, as learning identity was positively correlated with mathematics outcomes across two statistical tests.
Full Text Available
Identifying When and Why Students Choose to Quit Jobs in a Science Exploration Game

https://doi.org/10.1007/978-3-031-74138-8_5

Liu, Xiner; Slater, Stefan; Swanson, Luke; Metcalf, Shari J; Gagnon, David J; Baker, Ryan S (October 2024, Springer Nature Switzerland)

Students in open-ended educational games have a number of different pathways that they can select to work productively through a learning activity. Educators and system designers may want to know which of these pathways are most effective for engagement, learning, or other desirable outcomes. In this paper, we investigate which prior jobs and factors are associated with higher rates of student quitting behavior in an educational science exploration game. We use a series of chi-squared analyses to identify the jobs with the highest rates of quitting overall, and we calculate logistic regressions within specific jobs to determine the potential factors that lead to students quitting those jobs. Our analysis revealed that for 23 of the 40 jobs examined, having experience in at least one previous job significantly decreased the chances of students quitting the subsequent job, and that completing specific prior jobs reduces quit rates on specific later jobs. In our discussion, we describe the challenges associated with modeling quitting behavior, and how these analyses could be used to better optimize students’ pathways through the game environment. Specifically, guiding students through specific sequences of preliminary jobs before tackling more challenging jobs can improve their engagement and reduce dropout rates, thus optimizing their learning pathways.
Full Text Available
Using a Multi-Dimensional Model of Gender to Assess Learning with Different Game-Based Learning Narratives

Stec, Hayden; Richey, J Elizabeth; Nguyen, Huy; Else-Quest, Nicole; Hammer, Jessica; Baker, Ryan S; Arroyo, Ivon; McLaren, Bruce (October 2024, Play Story Press)

Digital learning games can help address gender disparities in math by promoting better learning experiences and outcomes for girls. However, there is a need for more research to understand why some digital learning games might be especially effective for girls studying mathematics. In this study, we assess two possible pathways: that girls might benefit from math games because they reduce the anxiety and evaluation apprehension that girls are more likely to experience when doing math; and that girls might benefit from math games when they enjoy the narrative and thus experience greater engagement. To evaluate these pathways, our work uses multiple dimensions of gender (e.g., gender identity and gender-typed interests, activities, and traits) and surveys of affective experiences to examine the impact of three learning systems with identical learning content: a digital learning game, Decimal Point, that has consistently led to better learning for girls over boys; a new masculine-typed game, Ocean Adventure, developed based on a survey of over 300 students; and a conventional tutoring system. We predicted that girls and students with stronger feminine-typed characteristics would experience less math anxiety in both Decimal Point and Ocean Adventure compared to the tutor. We also predicted that girls and students with stronger feminine-typed characteristics would experience greater engagement and learning with Decimal Point while boys and students with stronger masculine-typed characteristics would experience greater engagement and learning with Ocean Adventure. Consistent with predictions, students with stronger feminine-typed characteristics experienced less anxiety and evaluation apprehension in both games compared to the tutor. This suggests that math learning games may provide a way to address these negative affective experiences. 
In terms of our measures of engagement, we found that students with stronger masculine-typed characteristics reported greater experience of mastery in the masculine Ocean Adventure; however, this was the only indicator that the more masculine narrative of Ocean Adventure led to different experiences based on gender. This suggests that narrative alone may not have a strong enough effect on students based on gender, especially when other game features are kept constant. Contrary to our predictions, there were no effects of gender identity or condition on learning outcomes, although both masculine-typed and feminine-typed characteristics were negatively associated with learning. Overall, these results point to the value of a multi-dimensional model of gender in assessing learning with a game, the important role learning games can have in reducing math anxiety and evaluation apprehension for girls and students with feminine-typed characteristics, and the nuanced effects of game narratives on experiences with game-based learning.
Full Text Available
Detecting Unsuccessful Students in Cybersecurity Exercises in Two Different Learning Environments

https://doi.org/10.1109/FIE61694.2024.10893135

Švábenský, Valdemar; Tkáčik, Kristián; Birdwell, Aubrey; Weiss, Richard; Baker, Ryan S; Čeleda, Pavel; Vykopal, Jan; Mache, Jens; Chattopadhyay, Ankur (October 2024, Proceedings)

This paper evaluates the use of data logged from cybersecurity exercises in order to predict which students are potentially at risk of performing poorly. Hands-on exercises are essential for learning since they enable students to practice their skills. In cybersecurity, hands-on exercises are often complex and require knowledge of many topics. Therefore, students may miss solutions due to gaps in their knowledge and become frustrated, which impedes their learning. Targeted aid by the instructor helps, but since the instructor’s time is limited, efficient ways to detect struggling students are needed. This paper develops automated tools to predict when a student is having difficulty. We formed a dataset with the actions of 313 students from two countries and two learning environments: KYPO CRP and EDURange. These data are used in machine learning algorithms to predict the success of students in exercises deployed in these environments. After extracting features from the data, we trained and cross-validated eight classifiers for predicting the exercise outcome and evaluated their predictive power. The contribution of this paper is comparing two approaches to feature engineering, modeling, and classification performance on data from two learning environments. Using the features from either learning environment, we were able to detect and distinguish between successful and struggling students. A decision tree classifier achieved the highest balanced accuracy and sensitivity with data from both learning environments. The results show that activity data from cybersecurity exercises are suitable for predicting student success. In a potential application, such models can aid instructors in detecting struggling students and providing targeted help. We publish data and code for building these models so that others can adopt or adapt them.
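For readers unfamiliar with the two metrics named in the abstract above, the following is a minimal sketch of how sensitivity and balanced accuracy are commonly computed for a binary classifier. It is not the authors' code: the function names, the label encoding (1 = unsuccessful student, 0 = successful student), and the toy labels are all assumptions made for illustration.

```python
def sensitivity_and_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, with 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must appear in y_true")
    return tp / (tp + fn), tn / (tn + fp)


def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity; robust to class imbalance."""
    sensitivity, specificity = sensitivity_and_specificity(y_true, y_pred)
    return (sensitivity + specificity) / 2


# Toy labels, invented for illustration (1 = unsuccessful, 0 = successful).
toy_true = [1, 1, 0, 0]
toy_pred = [1, 0, 0, 0]
print(balanced_accuracy(toy_true, toy_pred))
```

Balanced accuracy is often preferred over plain accuracy when one outcome is much rarer than the other, since it weights both classes equally.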
Full Text Available
From Reaction to Anticipation: Predicting Future Affect

https://doi.org/10.5281/zenodo.12729885

Zambrano, Andres Felipe; Baker, Ryan S; Baral, Sami; Heffernan, Neil T; Lan, Andrew (July 2024, International Educational Data Mining Society)
Paaßen, Benjamin; Demmans Epp, Carrie (Ed.)
The educational data mining community has extensively investigated affect detection in learning platforms, finding associations between affective states and a wide range of learning outcomes. Based on these insights, several studies have used affect detectors to create interventions tailored to respond to when students are bored, confused, or frustrated. However, these detector-based interventions have depended on detecting affect when it occurs and therefore inherently respond to affective states after they have begun. This might not always be soon enough to avoid a negative experience for the student. In this paper, we aim to predict students' affective states in advance. Within our approach, we attempt to determine the maximum prediction window where detector performance remains sufficiently high, documenting the decay in performance when this prediction horizon is increased. Our results indicate that it is possible to predict confusion, frustration, and boredom in advance with performance over chance for prediction horizons of 120, 40, and 50 seconds, respectively. These findings open the door to designing more timely interventions.
Full Text Available